Semi-supervised Clustering for Word Instances and Its Effect on Word Sense Disambiguation
Authors
Abstract
We propose a supervised word sense disambiguation (WSD) system that uses features obtained from clustering results of word instances. Our approach is novel in that we employ semi-supervised clustering that controls the fluctuation of the centroid of a cluster, and we select seed instances by considering the frequency distribution of word senses and exclude outliers when we introduce “must-link” constraints between seed instances. In addition, we improve the supervised WSD accuracy by using features computed from word instances in clusters generated by the semi-supervised clustering. Experimental results show that these features are effective in improving WSD accuracy.
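The clustering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `seeded_kmeans`, the `max_shift` parameter, and the rule that caps how far a centroid may move per iteration are all assumptions, chosen as one plausible reading of "controls the fluctuation of the centroid". Seed instances that share a sense label are treated as must-linked by pinning them to their sense's cluster. Seed selection by sense frequency and outlier exclusion are not shown.

```python
import math

def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def seeded_kmeans(points, seeds, max_shift=0.5, iters=10):
    """Hypothetical seeded k-means with must-linked seeds.

    points: list of feature vectors, one per word instance.
    seeds:  {point_index: sense_label} for the labeled seed instances.
    max_shift: assumed cap on centroid movement per iteration.
    """
    labels = sorted(set(seeds.values()))
    # Each sense starts from the mean of its own seed instances.
    centroids = {
        s: mean([points[i] for i, lab in seeds.items() if lab == s])
        for s in labels
    }
    assign = {}
    for _ in range(iters):
        for i, p in enumerate(points):
            if i in seeds:
                # Must-link: a seed never leaves its sense's cluster.
                assign[i] = seeds[i]
            else:
                assign[i] = min(labels, key=lambda s: dist(p, centroids[s]))
        for s in labels:
            target = mean([points[i] for i in assign if assign[i] == s])
            shift = dist(centroids[s], target)
            # Limit the step so the centroid cannot drift far from the seeds.
            scale = 1.0 if shift <= max_shift else max_shift / shift
            centroids[s] = [c + scale * (t - c)
                            for c, t in zip(centroids[s], target)]
    return assign, centroids

# Toy usage: four seeds for two senses, two unlabeled instances.
points = [[0, 0], [0.2, 0], [5, 5], [5.2, 5], [0.1, 0.1], [4.9, 5.1]]
seeds = {0: "a", 1: "a", 2: "b", 3: "b"}
assign, centroids = seeded_kmeans(points, seeds)
```

In a pipeline like the one described, the resulting cluster memberships could then supply additional features to a supervised WSD classifier; how those features are computed is not specified in the abstract.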
Similar resources
Semi-supervised Learning by Fuzzy Clustering and Ensemble Learning
This paper proposes a semi-supervised learning method that uses fuzzy clustering to solve word sense disambiguation problems. Furthermore, we reduce the side effects of semi-supervised learning through ensemble learning. We set classes for labeled instances. The i-th labeled instance is used as the prototype of the i-th class. By applying fuzzy clustering to unlabeled instances, prototypes are moved to more sui...
A Semi-Supervised Feature Clustering Algorithm with Application to Word Sense Disambiguation
In this paper we investigate an application of feature clustering to word sense disambiguation and propose a semi-supervised feature clustering algorithm. Compared with other feature clustering methods (e.g., supervised feature clustering), it can infer the distribution of class labels over (unseen) features unavailable in training data (labeled data) by the use of the distribution of class labe...
Word Sense Induction and Disambiguation Rivaling Supervised Methods
Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context, and successful approaches are known to benefit many applications in Natural Language Processing. Although supervised learning has been shown to provide superior WSD performance, current sense-annotated corpora do not contain a sufficient number of instances per word type to train supervised systems for all words...
Word Sense Disambiguation by Semi-supervised Learning
In this paper we propose using a semi-supervised learning algorithm to address the word sense disambiguation problem. We evaluated a semi-supervised learning algorithm, the local and global consistency algorithm, on a widely used benchmark corpus for word sense disambiguation. This algorithm yields encouraging experimental results. It achieves better performance than orthodox supervised learning algor...
Word Sense Disambiguation in Hindi Language Using Hyperspace Analogue to Language and Fuzzy C-Means Clustering
The problem of Word Sense Disambiguation (WSD) can be defined as the task of assigning the most appropriate sense to a polysemous word within a given context. Many supervised, unsupervised, and semi-supervised approaches have been devised to deal with this problem, particularly for the English language. However, this is not the case for the Hindi language, where not much work has been done. In th...